<html>
<head><meta charset="utf-8"><title>Perf reproducibility · t-compiler · Zulip Chat Archive</title></head>
<h2>Stream: <a href="https://rust-lang.github.io/zulip_archive/stream/131828-t-compiler/index.html">t-compiler</a></h2>
<h3>Topic: <a href="https://rust-lang.github.io/zulip_archive/stream/131828-t-compiler/topic/Perf.20reproducibility.html">Perf reproducibility</a></h3>

<hr>

<base href="https://rust-lang.zulipchat.com">

<head><link href="https://rust-lang.github.io/zulip_archive/style.css" rel="stylesheet"></head>

<a name="194673591"></a>
<h4><a href="https://rust-lang.zulipchat.com#narrow/stream/131828-t-compiler/topic/Perf%20reproducibility/near/194673591" class="zl"><img src="https://rust-lang.github.io/zulip_archive/assets/img/zulip.svg" alt="view this post on Zulip" style="width:20px;height:20px;"></a> marmeladema <a href="https://rust-lang.github.io/zulip_archive/stream/131828-t-compiler/topic/Perf.20reproducibility.html#194673591">(Apr 20 2020 at 13:22)</a>:</h4>
<p>Hello!<br>
For a work-related project, we want to achieve something very similar to <a href="http://perf.rust-lang.org" title="http://perf.rust-lang.org">perf.rust-lang.org</a> where we can see performance improvements / regressions over time.<br>
But I wonder how you managed to get a setup that gives reproducible results?</p>



<a name="194673686"></a>
<h4><a href="https://rust-lang.zulipchat.com#narrow/stream/131828-t-compiler/topic/Perf%20reproducibility/near/194673686" class="zl"><img src="https://rust-lang.github.io/zulip_archive/assets/img/zulip.svg" alt="view this post on Zulip" style="width:20px;height:20px;"></a> marmeladema <a href="https://rust-lang.github.io/zulip_archive/stream/131828-t-compiler/topic/Perf.20reproducibility.html#194673686">(Apr 20 2020 at 13:23)</a>:</h4>
<p>like overcoming frequency scaling properly etc<br>
because from my own experience, it's very hard to get something that you can compare from run to run</p>



<a name="194673742"></a>
<h4><a href="https://rust-lang.zulipchat.com#narrow/stream/131828-t-compiler/topic/Perf%20reproducibility/near/194673742" class="zl"><img src="https://rust-lang.github.io/zulip_archive/assets/img/zulip.svg" alt="view this post on Zulip" style="width:20px;height:20px;"></a> simulacrum <a href="https://rust-lang.github.io/zulip_archive/stream/131828-t-compiler/topic/Perf.20reproducibility.html#194673742">(Apr 20 2020 at 13:23)</a>:</h4>
<p>perf is not very reproducible :)</p>
<p>It's been an ongoing project to reduce variability -- currently, we have ASLR (both kernel and user space) disabled and that's pretty much it</p>
<p>I have core pinning, frequency scaling, etc. planned as things to look at</p>
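<p>For readers who want to check the user-space ASLR setting mentioned above, here is a minimal sketch, assuming a Linux machine. The helper name <code>aslr_mode</code> is made up for illustration; the sysctl file <code>/proc/sys/kernel/randomize_va_space</code> and its values 0, 1 and 2 come from the kernel documentation. It only reads the setting and says nothing about kernel ASLR, which is controlled separately at boot.</p>

```python
# Hypothetical helper: report the user-space ASLR mode on Linux.
ASLR_MODES = {
    0: "disabled",
    1: "conservative (stack, mmap base, vDSO)",
    2: "full (also randomizes the heap)",
}


def aslr_mode(path="/proc/sys/kernel/randomize_va_space"):
    """Return (value, description) for the user-space ASLR sysctl."""
    with open(path) as f:
        value = int(f.read().strip())
    return value, ASLR_MODES.get(value, "unknown")
```

<p>A benchmark harness could call <code>aslr_mode()</code> at startup and refuse to record results unless the value is 0.</p>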



<a name="194673764"></a>
<h4><a href="https://rust-lang.zulipchat.com#narrow/stream/131828-t-compiler/topic/Perf%20reproducibility/near/194673764" class="zl"><img src="https://rust-lang.github.io/zulip_archive/assets/img/zulip.svg" alt="view this post on Zulip" style="width:20px;height:20px;"></a> simulacrum <a href="https://rust-lang.github.io/zulip_archive/stream/131828-t-compiler/topic/Perf.20reproducibility.html#194673764">(Apr 20 2020 at 13:23)</a>:</h4>
<p>but for the most part we just go for instruction counts, which are mostly reliable</p>
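<p>One common way to collect instruction counts on Linux is <code>perf stat</code>. The sketch below is an illustration, not a description of how perf.rust-lang.org collects its numbers: the function names are made up, and it assumes <code>perf</code> is installed and that the event is reported under the name <code>instructions:u</code>. With <code>-x,</code>, perf prints one comma-separated line per event, with the counter value first and the event name third.</p>

```python
import subprocess


def parse_perf_csv(text):
    """Map event name to counter value from `perf stat -x,` output."""
    counts = {}
    for line in text.splitlines():
        fields = line.split(",")
        if len(fields) >= 3:
            try:
                counts[fields[2]] = int(fields[0])
            except ValueError:
                continue  # not a counter line, or the event was not counted
    return counts


def count_instructions(cmd):
    """Run cmd under perf stat and return its user-space instruction count.

    perf writes its report to stderr, so the command's own stdout is dropped.
    """
    result = subprocess.run(
        ["perf", "stat", "-x,", "-e", "instructions:u", "--"] + list(cmd),
        stdout=subprocess.DEVNULL,
        stderr=subprocess.PIPE,
        text=True,
        check=True,
    )
    return parse_perf_csv(result.stderr)["instructions:u"]
```

<p>For example, <code>count_instructions(["rustc", "--version"])</code> would return a single integer that can be compared between two builds.</p>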



<a name="194677113"></a>
<h4><a href="https://rust-lang.zulipchat.com#narrow/stream/131828-t-compiler/topic/Perf%20reproducibility/near/194677113" class="zl"><img src="https://rust-lang.github.io/zulip_archive/assets/img/zulip.svg" alt="view this post on Zulip" style="width:20px;height:20px;"></a> marmeladema <a href="https://rust-lang.github.io/zulip_archive/stream/131828-t-compiler/topic/Perf.20reproducibility.html#194677113">(Apr 20 2020 at 13:49)</a>:</h4>
<p>Ok! Locally (i.e. on my laptop) I've isolated 1 physical core (2 threads) and I've removed as many IRQs as I can etc</p>
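<p>Once a core has been isolated as described above, the benchmark still has to be placed on it. A minimal sketch, assuming Linux and Python's <code>os.sched_setaffinity</code>; the helper name <code>pin_to_cpu</code> and the choice of the highest-numbered CPU are arbitrary. Pinning alone does not keep other tasks or interrupts off that core; that part is the separate isolation step.</p>

```python
import os


def pin_to_cpu(cpu=None):
    """Pin the current process to one CPU and return the chosen CPU id.

    Linux-only: os.sched_setaffinity is not available on every platform.
    """
    allowed = sorted(os.sched_getaffinity(0))  # CPUs we may currently run on
    if cpu is None:
        cpu = allowed[-1]  # arbitrary default: highest-numbered allowed CPU
    if cpu not in allowed:
        raise ValueError(f"CPU {cpu} is not in the allowed set {allowed}")
    os.sched_setaffinity(0, {cpu})
    return cpu
```

<p>Child processes started afterwards inherit the affinity mask, so calling <code>pin_to_cpu()</code> before spawning the benchmark is enough to keep it on that CPU.</p>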



<a name="194677168"></a>
<h4><a href="https://rust-lang.zulipchat.com#narrow/stream/131828-t-compiler/topic/Perf%20reproducibility/near/194677168" class="zl"><img src="https://rust-lang.github.io/zulip_archive/assets/img/zulip.svg" alt="view this post on Zulip" style="width:20px;height:20px;"></a> marmeladema <a href="https://rust-lang.github.io/zulip_archive/stream/131828-t-compiler/topic/Perf.20reproducibility.html#194677168">(Apr 20 2020 at 13:49)</a>:</h4>
<p>But even with that, the instruction cache can make things very noisy</p>



<a name="194677253"></a>
<h4><a href="https://rust-lang.zulipchat.com#narrow/stream/131828-t-compiler/topic/Perf%20reproducibility/near/194677253" class="zl"><img src="https://rust-lang.github.io/zulip_archive/assets/img/zulip.svg" alt="view this post on Zulip" style="width:20px;height:20px;"></a> marmeladema <a href="https://rust-lang.github.io/zulip_archive/stream/131828-t-compiler/topic/Perf.20reproducibility.html#194677253">(Apr 20 2020 at 13:50)</a>:</h4>
<p>Also laptop cpus are totally unpredictable</p>



<a name="194677320"></a>
<h4><a href="https://rust-lang.zulipchat.com#narrow/stream/131828-t-compiler/topic/Perf%20reproducibility/near/194677320" class="zl"><img src="https://rust-lang.github.io/zulip_archive/assets/img/zulip.svg" alt="view this post on Zulip" style="width:20px;height:20px;"></a> marmeladema <a href="https://rust-lang.github.io/zulip_archive/stream/131828-t-compiler/topic/Perf.20reproducibility.html#194677320">(Apr 20 2020 at 13:50)</a>:</h4>
<p>Can I ask what kind of machine you run those tests on? And in which environment? Containers, etc.?</p>



<a name="194677890"></a>
<h4><a href="https://rust-lang.zulipchat.com#narrow/stream/131828-t-compiler/topic/Perf%20reproducibility/near/194677890" class="zl"><img src="https://rust-lang.github.io/zulip_archive/assets/img/zulip.svg" alt="view this post on Zulip" style="width:20px;height:20px;"></a> simulacrum <a href="https://rust-lang.github.io/zulip_archive/stream/131828-t-compiler/topic/Perf.20reproducibility.html#194677890">(Apr 20 2020 at 13:55)</a>:</h4>
<p>no containers, this is on a cloud provided machine (AMD Ryzen 5 3600 6-Core Processor)</p>



<a name="194677928"></a>
<h4><a href="https://rust-lang.zulipchat.com#narrow/stream/131828-t-compiler/topic/Perf%20reproducibility/near/194677928" class="zl"><img src="https://rust-lang.github.io/zulip_archive/assets/img/zulip.svg" alt="view this post on Zulip" style="width:20px;height:20px;"></a> simulacrum <a href="https://rust-lang.github.io/zulip_archive/stream/131828-t-compiler/topic/Perf.20reproducibility.html#194677928">(Apr 20 2020 at 13:55)</a>:</h4>
<p>in theory the machine is not shared though</p>



<a name="194678061"></a>
<h4><a href="https://rust-lang.zulipchat.com#narrow/stream/131828-t-compiler/topic/Perf%20reproducibility/near/194678061" class="zl"><img src="https://rust-lang.github.io/zulip_archive/assets/img/zulip.svg" alt="view this post on Zulip" style="width:20px;height:20px;"></a> marmeladema <a href="https://rust-lang.github.io/zulip_archive/stream/131828-t-compiler/topic/Perf.20reproducibility.html#194678061">(Apr 20 2020 at 13:56)</a>:</h4>
<p>Yeah ok that was my next question. So in theory, this machine is meant to run only rustc perf tests</p>



<a name="194678193"></a>
<h4><a href="https://rust-lang.zulipchat.com#narrow/stream/131828-t-compiler/topic/Perf%20reproducibility/near/194678193" class="zl"><img src="https://rust-lang.github.io/zulip_archive/assets/img/zulip.svg" alt="view this post on Zulip" style="width:20px;height:20px;"></a> simulacrum <a href="https://rust-lang.github.io/zulip_archive/stream/131828-t-compiler/topic/Perf.20reproducibility.html#194678193">(Apr 20 2020 at 13:57)</a>:</h4>
<p>that's the only thing <em>we</em> use it for (and IIRC, the cloud provider claims we have dedicated hardware, but I forget)</p>



<a name="194678457"></a>
<h4><a href="https://rust-lang.zulipchat.com#narrow/stream/131828-t-compiler/topic/Perf%20reproducibility/near/194678457" class="zl"><img src="https://rust-lang.github.io/zulip_archive/assets/img/zulip.svg" alt="view this post on Zulip" style="width:20px;height:20px;"></a> Pietro Albini <a href="https://rust-lang.github.io/zulip_archive/stream/131828-t-compiler/topic/Perf.20reproducibility.html#194678457">(Apr 20 2020 at 13:59)</a>:</h4>
<p>yep, hetzner claims it's a dedicated server</p>



<a name="194678517"></a>
<h4><a href="https://rust-lang.zulipchat.com#narrow/stream/131828-t-compiler/topic/Perf%20reproducibility/near/194678517" class="zl"><img src="https://rust-lang.github.io/zulip_archive/assets/img/zulip.svg" alt="view this post on Zulip" style="width:20px;height:20px;"></a> bjorn3 <a href="https://rust-lang.github.io/zulip_archive/stream/131828-t-compiler/topic/Perf.20reproducibility.html#194678517">(Apr 20 2020 at 14:00)</a>:</h4>
<p>I run <code>echo "performance" | sudo tee /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor </code> to ensure that the cpu frequency changes as little as possible. You may want to set it to <code>powersave</code> afterwards.</p>
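<p>The set-then-restore pattern from the command above can be wrapped so the old governor comes back even if the benchmark crashes. This is a sketch under assumptions: the context manager name <code>scaling_governor</code> is made up, it needs root to write to the real sysfs files, and it restores whatever governor each CPU had before rather than hard-coding <code>powersave</code>.</p>

```python
import glob
from contextlib import contextmanager

SYSFS_PATTERN = "/sys/devices/system/cpu/cpu*/cpufreq/scaling_governor"


@contextmanager
def scaling_governor(governor="performance", pattern=SYSFS_PATTERN):
    """Temporarily set the cpufreq governor on every matching CPU."""
    previous = {}
    for path in glob.glob(pattern):
        with open(path) as f:
            previous[path] = f.read().strip()  # remember the current governor
    try:
        for path in previous:
            with open(path, "w") as f:
                f.write(governor)
        yield previous
    finally:
        # Put back what each CPU had before, even if the benchmark failed.
        for path, old in previous.items():
            with open(path, "w") as f:
                f.write(old)
```

<p>Usage would look like <code>with scaling_governor("performance"): run_benchmarks()</code>, where <code>run_benchmarks</code> stands in for whatever the harness does.</p>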



<a name="194678684"></a>
<h4><a href="https://rust-lang.zulipchat.com#narrow/stream/131828-t-compiler/topic/Perf%20reproducibility/near/194678684" class="zl"><img src="https://rust-lang.github.io/zulip_archive/assets/img/zulip.svg" alt="view this post on Zulip" style="width:20px;height:20px;"></a> marmeladema <a href="https://rust-lang.github.io/zulip_archive/stream/131828-t-compiler/topic/Perf.20reproducibility.html#194678684">(Apr 20 2020 at 14:01)</a>:</h4>
<p>Did disabling ASLR really make a difference for reproducibility?</p>



<a name="194678747"></a>
<h4><a href="https://rust-lang.zulipchat.com#narrow/stream/131828-t-compiler/topic/Perf%20reproducibility/near/194678747" class="zl"><img src="https://rust-lang.github.io/zulip_archive/assets/img/zulip.svg" alt="view this post on Zulip" style="width:20px;height:20px;"></a> marmeladema <a href="https://rust-lang.github.io/zulip_archive/stream/131828-t-compiler/topic/Perf.20reproducibility.html#194678747">(Apr 20 2020 at 14:02)</a>:</h4>
<p>Also is there some kind of normalization step of your metrics / numbers? Or, if you change hardware, then all the data becomes irrelevant?</p>



<a name="194678883"></a>
<h4><a href="https://rust-lang.zulipchat.com#narrow/stream/131828-t-compiler/topic/Perf%20reproducibility/near/194678883" class="zl"><img src="https://rust-lang.github.io/zulip_archive/assets/img/zulip.svg" alt="view this post on Zulip" style="width:20px;height:20px;"></a> bjorn3 <a href="https://rust-lang.github.io/zulip_archive/stream/131828-t-compiler/topic/Perf.20reproducibility.html#194678883">(Apr 20 2020 at 14:02)</a>:</h4>
<p>I believe all data was removed when the hardware changed the last time.</p>



<a name="194679108"></a>
<h4><a href="https://rust-lang.zulipchat.com#narrow/stream/131828-t-compiler/topic/Perf%20reproducibility/near/194679108" class="zl"><img src="https://rust-lang.github.io/zulip_archive/assets/img/zulip.svg" alt="view this post on Zulip" style="width:20px;height:20px;"></a> Hanna Kruppe <a href="https://rust-lang.github.io/zulip_archive/stream/131828-t-compiler/topic/Perf.20reproducibility.html#194679108">(Apr 20 2020 at 14:04)</a>:</h4>
<p>Note that due to numerous microarchitectural features in modern CPUs, wall clock time (and other metrics correlated with it) can vary greatly depending on details of the overall code and data layout (e.g. sizes as well as relative and absolute addresses of individual code sections and heap/stack allocations) that are almost guaranteed to be perturbed as a side effect of essentially any code change. So it's questionable if "stable" performance measurements across code changes that should be performance-neutral are even possible to achieve.</p>



<a name="194679231"></a>
<h4><a href="https://rust-lang.zulipchat.com#narrow/stream/131828-t-compiler/topic/Perf%20reproducibility/near/194679231" class="zl"><img src="https://rust-lang.github.io/zulip_archive/assets/img/zulip.svg" alt="view this post on Zulip" style="width:20px;height:20px;"></a> simulacrum <a href="https://rust-lang.github.io/zulip_archive/stream/131828-t-compiler/topic/Perf.20reproducibility.html#194679231">(Apr 20 2020 at 14:05)</a>:</h4>
<p>yeah, we're mostly aiming for stable instruction counts (or as much as possible)</p>



<a name="194679275"></a>
<h4><a href="https://rust-lang.zulipchat.com#narrow/stream/131828-t-compiler/topic/Perf%20reproducibility/near/194679275" class="zl"><img src="https://rust-lang.github.io/zulip_archive/assets/img/zulip.svg" alt="view this post on Zulip" style="width:20px;height:20px;"></a> simulacrum <a href="https://rust-lang.github.io/zulip_archive/stream/131828-t-compiler/topic/Perf.20reproducibility.html#194679275">(Apr 20 2020 at 14:05)</a>:</h4>
<p>ah, yes, we also set scaling governor to performance, forgot about that</p>



<a name="194679368"></a>
<h4><a href="https://rust-lang.zulipchat.com#narrow/stream/131828-t-compiler/topic/Perf%20reproducibility/near/194679368" class="zl"><img src="https://rust-lang.github.io/zulip_archive/assets/img/zulip.svg" alt="view this post on Zulip" style="width:20px;height:20px;"></a> simulacrum <a href="https://rust-lang.github.io/zulip_archive/stream/131828-t-compiler/topic/Perf.20reproducibility.html#194679368">(Apr 20 2020 at 14:06)</a>:</h4>
<p>yeah I don't think anything we've done on the machine itself has had appreciable impact on reproducibility -- there's been some changes to the compiler itself or how we build it, but not beyond that</p>



<a name="194679428"></a>
<h4><a href="https://rust-lang.zulipchat.com#narrow/stream/131828-t-compiler/topic/Perf%20reproducibility/near/194679428" class="zl"><img src="https://rust-lang.github.io/zulip_archive/assets/img/zulip.svg" alt="view this post on Zulip" style="width:20px;height:20px;"></a> marmeladema <a href="https://rust-lang.github.io/zulip_archive/stream/131828-t-compiler/topic/Perf.20reproducibility.html#194679428">(Apr 20 2020 at 14:07)</a>:</h4>
<p><span class="user-mention" data-user-id="124289">@Hanna Kruppe</span> yep, I understand, but I still need to monitor perf for critical projects, and I wondered how people do it</p>



<a name="194679566"></a>
<h4><a href="https://rust-lang.zulipchat.com#narrow/stream/131828-t-compiler/topic/Perf%20reproducibility/near/194679566" class="zl"><img src="https://rust-lang.github.io/zulip_archive/assets/img/zulip.svg" alt="view this post on Zulip" style="width:20px;height:20px;"></a> marmeladema <a href="https://rust-lang.github.io/zulip_archive/stream/131828-t-compiler/topic/Perf.20reproducibility.html#194679566">(Apr 20 2020 at 14:08)</a>:</h4>
<p>And ideally i'd like to understand if a change has an impact or not</p>



<a name="194680057"></a>
<h4><a href="https://rust-lang.zulipchat.com#narrow/stream/131828-t-compiler/topic/Perf%20reproducibility/near/194680057" class="zl"><img src="https://rust-lang.github.io/zulip_archive/assets/img/zulip.svg" alt="view this post on Zulip" style="width:20px;height:20px;"></a> Hanna Kruppe <a href="https://rust-lang.github.io/zulip_archive/stream/131828-t-compiler/topic/Perf.20reproducibility.html#194680057">(Apr 20 2020 at 14:12)</a>:</h4>
<p>Well, the answer is they mostly don't address this problem :) They either side-step it by looking at more stable but limited measures like instruction count or something higher-level, or try to remove as much variance as possible (some techniques discussed above) and then try to account for the remaining "noise" with rules of thumb, often with questionable results IMO. The only rigorous approach I know of is Stabilizer <br>
(&lt;<a href="https://people.cs.umass.edu/~emery/pubs/stabilizer-asplos13.pdf" title="https://people.cs.umass.edu/~emery/pubs/stabilizer-asplos13.pdf">https://people.cs.umass.edu/~emery/pubs/stabilizer-asplos13.pdf</a>&gt;) but as far as I can tell it hasn't achieved much adoption.</p>
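<p>As an alternative to rules of thumb for the remaining noise, repeated timings of two builds can be compared with a statistical test. The sketch below is a generic permutation test on the difference of means, swapped in for illustration; it is not what Stabilizer does (Stabilizer randomizes code and data layout before measuring), and the function name, round count and seed are arbitrary. It also does nothing about layout effects, which shift all runs of one build together.</p>

```python
import random
import statistics


def permutation_p_value(before, after, rounds=10_000, seed=0):
    """Two-sided permutation test on the difference of means.

    Estimates how often a random relabeling of the pooled samples gives a
    mean difference at least as large as the observed one.
    """
    rng = random.Random(seed)
    observed = abs(statistics.mean(after) - statistics.mean(before))
    pooled = list(before) + list(after)
    n = len(before)
    hits = 0
    for _ in range(rounds):
        rng.shuffle(pooled)
        diff = abs(statistics.mean(pooled[n:]) - statistics.mean(pooled[:n]))
        if diff >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (rounds + 1)
```

<p>A small value suggests the difference between the two sets of runs is unlikely to come from run-to-run noise alone; a large value means the data cannot tell the two builds apart.</p>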



<a name="194680445"></a>
<h4><a href="https://rust-lang.zulipchat.com#narrow/stream/131828-t-compiler/topic/Perf%20reproducibility/near/194680445" class="zl"><img src="https://rust-lang.github.io/zulip_archive/assets/img/zulip.svg" alt="view this post on Zulip" style="width:20px;height:20px;"></a> marmeladema <a href="https://rust-lang.github.io/zulip_archive/stream/131828-t-compiler/topic/Perf.20reproducibility.html#194680445">(Apr 20 2020 at 14:15)</a>:</h4>
<p>Thank you for the input and for the paper, I'll read and share it :)</p>



<a name="194708101"></a>
<h4><a href="https://rust-lang.zulipchat.com#narrow/stream/131828-t-compiler/topic/Perf%20reproducibility/near/194708101" class="zl"><img src="https://rust-lang.github.io/zulip_archive/assets/img/zulip.svg" alt="view this post on Zulip" style="width:20px;height:20px;"></a> marmeladema <a href="https://rust-lang.github.io/zulip_archive/stream/131828-t-compiler/topic/Perf.20reproducibility.html#194708101">(Apr 20 2020 at 17:32)</a>:</h4>
<p><span class="user-mention" data-user-id="124289">@Hanna Kruppe</span> I gave the paper a quick read, it's interesting. I lack the knowledge in statistics though ^^ It's sad that <a href="https://github.com/ccurtsinger/stabilizer" title="https://github.com/ccurtsinger/stabilizer">https://github.com/ccurtsinger/stabilizer</a> is unmaintained :(</p>



<hr><p>Last updated: Aug 07 2021 at 22:04 UTC</p>
</html>